tokfandomcom-20200215-history
Fast multiplication
Booth's multiplication algorithm :See also: Booth's multiplication algorithm is a multiplication algorithm that multiplies two signed binary numbers in two's complement notation. Consider a positive multiplier consisting of a block of 1s surrounded by 0s. For example, 00111110. The product is given by: : M \times \,^{\prime\prime} 0 \; 0 \; 1 \; 1 \; 1 \; 1 \; 1 \; 0 \,^{\prime\prime} = M \times (2^5 + 2^4 + 2^3 + 2^2 + 2^1) = M \times 62 where M is the multiplicand. The number of operations can be reduced to two by rewriting the same as : M \times \,^{\prime\prime} 0 \; 1 \; 0 \;0 \;0 \; 0 \mbox{-1} \; 0 \,^{\prime\prime} = M \times (2^6 - 2^1) = M \times 62. In fact, it can be shown that any sequence of 1s in a binary number can be broken into the difference of two binary numbers: : (\ldots 0 \overbrace{1 \ldots 1}^{n} 0 \ldots)_{2} \equiv (\ldots 1 \overbrace{0 \ldots 0}^{n} 0 \ldots)_{2} - (\ldots 0 \overbrace{0 \ldots 1}^{n} 0 \ldots)_2. Hence, the multiplication can actually be replaced by the string of ones in the original number by simpler operations, adding the multiplier, shifting the partial product thus formed by appropriate places, and then finally subtracting the multiplier. It is making use of the fact that it is not necessary to do anything but shift while dealing with 0s in a binary multiplier, and is similar to using the mathematical property that 99 = 100 − 1 while multiplying by 99. This scheme can be extended to any number of blocks of 1s in a multiplier (including the case of a single 1 in a block). Thus, : M \times \,^{\prime\prime} 0 \; 0 \; 1 \; 1 \; 1 \; 0 \; 1 \; 0 \,^{\prime\prime} = M \times (2^5 + 2^4 + 2^3 + 2^1) = M \times 58 : M \times \,^{\prime\prime} 0 \; 1 \; 0 \; 0 \mbox{-1} \; 1 \mbox{-1} \; 0 \,^{\prime\prime} = M \times (2^6 - 2^3 + 2^2 - 2^1) = M \times 58. Booth's algorithm follows this old scheme by performing an addition when it encounters the first digit of a block of ones (0 1) and a subtraction when it encounters the end of the block (1 0). This works for a negative multiplier as well. When the ones in a multiplier are grouped into long blocks, Booth's algorithm performs fewer additions and subtractions than the normal multiplication algorithm. In decimal : \begin{align} 999 \times 999 &= (1000-1)(1000-1)\\ &= 1,000,000 - 2000 + 1\\ &= 998,001 \end{align} Complex multiplication algorithm Complex multiplication normally involves four multiplications and two additions. : (a+bi) (c+di) = (ac-bd) + (bc+ad)i. Or : \begin{array}{c|c|c} \times & a & bi \\ \hline c & ac & bci \\ \hline di & adi & -bd \end{array} But there is a way of reducing the number of multiplications to three. The product (a'' + ''bi) · (c'' + ''di) can be calculated in the following way. :k''1 = ''c · (a'' + ''b) :k''2 = ''a · (d'' - ''c) :k''3 = ''b · (c'' + ''d) :Real part = k''1 - ''k''3 :Imaginary part = ''k''1 + ''k''2. This algorithm uses only three multiplications, rather than four, and five additions or subtractions rather than two. If a multiply is more expensive than three adds or subtracts, as when calculating by hand, then there is a gain in speed. On modern computers a multiply and an add can take about the same time so there may be no speed gain. There is a trade-off in that there may be some loss of precision when using floating point. For s (FFTs) (or any ) the complex multiplies are by constant coefficients ''c + di (called s in FFTs), in which case two of the additions (d''-''c and c''+''d) can be precomputed. Hence, only three multiplies and three adds are required. However, trading off a multiplication for an addition in this way may no longer be beneficial with modern s. Karatsuba algorithm The Karatsuba algorithm is a fast multiplication algorithm. It was discovered by Anatoly Karatsuba in 1960 and published in 1962. It is a generalization of the complex multiplication algorithm above, where the imaginary unit i is replaced by a power of the base. It reduces the multiplication of two n''-digit numbers to at most n^{\log_23}\approx n^{1.58} single-digit multiplications in general (and exactly n^{\log_23} when ''n is a power of 2). It is therefore faster than the , which requires n^2 single-digit products. :For example, the Karatsuba algorithm requires 310 = 59,049 single-digit multiplications to multiply two 1024-digit numbers (n'' = 1024 = 210), whereas the classical algorithm requires (210)2 = 1,048,576 (a speedup of 17.75 times). The Karatsuba algorithm was the first multiplication algorithm asymptotically faster than the quadratic "grade school" algorithm. Example The basic step of Karatsuba's algorithm is a formula that allows one to compute the product of two large numbers x and y using three multiplications of smaller numbers, each with about half as many digits as x or y , plus some additions and digit shifts. To compute the product of 12345 and 6789 we decompose the input operands as: : 12345 = '''12' · 1000 + 345 : 6789 = 6''' · 1000 + '''789 Only three multiplications, which operate on smaller integers, are used to compute three partial results: : z''2 = '''12' ×''' '''6 = 72 : z''0 = '''345' ×''' '''789 = 272205 : z''1 = ('12''' + 345) ×''' ('''6 + 789) − z''2 − ''z''0 = 357 '×''' 795 − 72 − 272205 = 283815 − 72 − 272205 = 11538 We get the final result by just adding these three partial results, shifted accordingly: : result = z''2 · (''Bm'')''2 + z''1 · (''Bm'')''1 + z''0 · (''Bm'')''0, i.e. : result = 72 · 1000''2 + 11538 · ''1000 + 272205 = 83810205. Explanation Let x and y be represented as n -digit strings in some base B . For any positive integer m less than n , one can write the two given numbers as : x = x_1 B^m + x_0, : y = y_1 B^m + y_0, where x_0 and y_0 are less than B^m . of x and y requires four multiplications: : xy = (x_1 B^m + x_0)(y_1 B^m + y_0), : xy = x_1 y_1 B^{2m} + x_1 y_0 B^m + x_0 y_1 B^m + x_0 y_0 Karatsuba observed that xy can be computed in only , at the cost of a few extra additions. : xy = x_1 y_1 B^{2m} + x_1 y_0 B^m + x_0 y_1 B^m + x_0 y_0 : xy = x_1 y_1 B^{2m} + (x_1 y_0 + x_0 y_1 ) B^m + x_0 y_0 : xy = z_2 B^{2m} + z_1 B^m + z_0, where : z_2 = {\color{red}x_1 y_1} : z_1 = x_1 y_0 + x_0 y_1 : z_0 = {\color{red}x_0 y_0} and : z_1 = {\color{red}(x_1 + x_0) (y_1 + y_0)} - z_2 - z_0 : z_1 = x_1 y_1 + x_1 y_0 + x_0 y_1 + x_0 y_0 - x_1 y_1 - x_0 y_0 : z_1 = x_1 y_0 + x_0 y_1 If n is four or more, the three multiplications in Karatsuba's basic step can be computed by recursive calls of the Karatsuba algorithm. The recursion can be applied until the numbers are so small that they can (or must) be computed directly. See . Issue With z_0 and z_2 as before one can observe that : z_1 = (x_1 + x_0)(y_1 + y_0) - z_2 - z_0. An issue that occurs, however, when computing z_1 is that the above computation of (x_1 + x_0) and (y_1 + y_0) may result in overflow (will produce a result in the range 0 \leq \text{result} < 2 B^m ), which require a multiplier having one extra bit. This can be avoided by noting that : z_1 = (x_0 - x_1)(y_1 - y_0) + z_2 + z_0. This computation of (x_0 - x_1) and (y_1 - y_0) will produce a result in the range of -B^m < \text{result} < B^m . This method may produce negative numbers, which require one extra bit to encode signedness, and would still require one extra bit for the multiplier. However, one way to avoid this is to record the sign and then use the absolute value of (x_0 - x_1) and (y_1 - y_0) to perform an unsigned multiplication, after which the result may be negated when both signs originally differed. Another advantage is that even though (x_0 - x_1)(y_1 - y_0) may be negative, the final computation of z_1 only involves additions. See also * * References Category:Computer science